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Abstract. Using a sample of approximately 14,000 z > 2.1 quasars observed in the first year 
of the Baryon Oscillation Spectroscopic Survey (BOSS), we measure the three-dimensional 
correlation function of absorption in the Lyman-a forest. The angle- averaged correlation 
function of transmitted flux (F = e~ T ) is securely detected out to comoving separations 
of 60/i~ 1 Mpc, the first detection of flux correlations across widely separated sightlines. A 
quadrupole distortion of the redshift-space correlation function by peculiar velocities, the 
signature of the gravitational instability origin of structure in the Lyman-a forest, is also 
detected at high significance. We obtain a good fit to the data assuming linear theory redshift- 
space distortion and linear bias of the transmitted flux, relative to the matter fluctuations 
of a standard ACDM cosmological model (inflationary cold dark matter with a cosmological 
constant). At 95% confidence, we find a linear bias parameter 0.16 < b < 0.24 and redshift- 
distortion parameter 0.44 < (3 < 1.20, at central redshift z = 2.25, with a well constrained 
combination b (1+/3) = 0.336±0.012. The errors on /3 are asymmetric, with (3 = excluded at 
over 5o" confidence level. The value of f3 is somewhat low compared to theoretical predictions, 
and our tests on synthetic data suggest that it is depressed (relative to expectations for 
the Lyman-a forest alone) by the presence of high column density systems and metal line 
absorption. These results set the stage for cosmological parameter determinations from 
three-dimensional structure in the Lyman-a forest, including anticipated constraints on dark 
energy from baryon acoustic oscillations. 
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1 Introduction 

Early spectra of high-redshift (z > 2) quasars showed ubiquitous absorption lines blueward of 
their Lyman-a emission, which were identified as arising primarily from Lyman-a absorption 
by intervening concentrations of neutral hydrogen [1]. While early models described the 
absorbers as discrete clouds analogous to those in the interstellar medium, a combination of 
theoretical and observational advances in the mid-1990s led to a revised view of the forest 
as a continuous phenomenon, analogous to Gunn-Peterson [2] absorption but tracing an 
inhomogeneous, fluctuating intergalactic medium (see, e.g., the review by [3]). This revision 



- 1 - 



of theoretical understanding also transformed the promise of the Lyman-a forest as a tool 
for cosmology, by showing that the forest traces the distribution of dark matter in the high- 
redshift universe in a relatively simple way. Adopting this "continuous medium" view, several 
groups [4-9] measured the power spectrum of Lyman-a forest flux in successively larger 
samples of quasar spectra and used it to constrain the power spectrum of the underlying dark 
matter distribution. The most recent of these measurements [9], using a large quasar sample 
from the Sloan Digital Sky Survey (SDSS; [10-17]), have provided some of the strongest 
constraints on inflation, neutrino masses, and the "coldness" of dark matter, especially when 
combined with data sets that probe other redshifts and/or physical scales (e.g., [18-23]). 
However, while the underlying density field is three-dimensional, all of these cosmological 
analyses have treated the forest as a collection of independent 1-dimensional maps. 

This paper presents measurements of Lyman-a forest flux correlations across parallel 
lines of sight with the widest separation reached so far, taking advantage of the large sam- 
ple of high-redshift quasars observed during the first year of BOSS, the Baryon Oscillation 
Spectroscopic Survey of SDSS-III [24]. 

Measurements of correlated absorption towards gravitational lenses and closely sepa- 
rated quasar pairs [25-28] provided some of the critical observational evidence for the revised 
understanding of the Lyman-a forest. These transverse correlation measurements implied a 
coherence scale of several hundred /i _1 kpc (comoving) for individual absorbing structures. 
The large sizes and low neutral column densities in turn implied that the absorbing gas has 
low densities and is highly photoionized by the intergalactic UV background, making the 
total baryon content of the forest a large fraction of the baryons allowed by Big Bang nu- 
cleosynthesis [29]. Hydrodynamic cosmological simulations naturally explained these large 
coherence scales and many other statistical properties of the forest [30-33], with typical ab- 
sorption features arising in filamentary structures of moderate overdensity, 5 = p/p — 1 ~ 20. 
Detailed investigation of these simulations revealed an approximate description of the forest 
that is both surprisingly simple and surprisingly accurate [34, 35]: the Lyman-a forest arises 
in gas that traces the underlying dark matter distribution, except for small scale pressure 
support, with the neutral hydrogen fraction determined by photoionization equilibrium, and 
with a power-law temperature-density relation T oc (p/p) 7-1 . This relation is established by 
the competition of photoionization heating and adiabatic cooling [36]. In this approximation, 
the transmitted flux fraction F is related to the dark matter overdensity 5 by 



F = e T = exp 



,4(1 + ,5)2-0.7(7-1) ; (L1) 



where the temperature-density slope (7 — 1) depends on the inter-galactic medium (IGM) 
reionization history, and the constant A depends on redshift and on a variety of physical 
parameters (see [37] for a recent discussion). On large scales, the three-dimensional power 
spectrum of the field 5f = F/F — 1 should have the same shape as the power spectrum of 
5 = p/p — 1 [4, 5, 35]. Thermal motions of atoms smooth spectra on small scales, producing 
a turnover in the one-dimensional flux power spectrum at high k. (In practice, peculiar 
velocities, which we discuss next, produce a turnover at larger scales than thermal motions.) 

The most significant correction to equation (1.1) is from peculiar velocities, which shift 
the apparent locations of the absorbing neutral hydrogen in the radial direction. On large 
scales, the power spectrum should approximately follow the linear theory model of redshift- 
space distortions, 

P F (k,p k ) = b 2 P L (k)(l + Pp 2 k ) 2 , (1.2) 
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where Pi{k) is the real-space linear power spectrum and is the cosine of the angle between 
the wavevector k and the line of sight. The bias factor b of the forest is the bias factor of the 
contrast of the flux fluctuations and not the bias factor of the neutral hydrogen. It is typically 
low because the full range of density variations is mapped into the transmitted flux range 
< F < 1 (6 is actually technically negative, because overdensities in mass produce smaller 
F, but this is irrelevant to our paper so we will just quote \b\). While the functional form in 
Eq. 1.2 is that of Kaiser [38], the parameter (3 does not have the same interpretation, coming 
from a more general linear theory calculation of redshift-space distortions in the case where 
the directly distorted field, in this case optical depth, undergoes a subsequent non-linear 
transformation, in this case F = exp(— r) [5, 39]. For a galaxy survey, j3 « [f2 m (,z)] a55 /6, 
but for the Lyman-a forest f3 can vary independently of Q m and b, i.e., it is generally a free 
parameter. 

The simulations of [39] suggest an approximate value of /3 ~ 1.47 at z ~ 2.25 (inter- 
polating over the parameter dependences in Table 1 of [39], since the central model there 
is antiquated). Lower resolution calculations of [40] (500/i _1 kpc mean particle spacing and 
particle-mesh (PM) cell size) found a lower value j3 ~ 1. Unpublished work by Martin White 
demonstrates that the value of (3 predicted by PM simulations with different smoothing ap- 
plied to the mass density field decreases with increasing smoothing length (based simulations 
with 73 /i _1 kpc resolution, and in agreement with the ~ 180 /i _1 kpc resolution simulation in 
[41]). Reference [39] concurred that low resolution simulations produce lower (3, and agree 
with White on the value at similar smoothing, so the fundamental outstanding question for 
predicting f3 is apparently how smooth is the gas in the IGM ([39] quantifies other parameter 
dependences, but none of them make so much difference, given the relatively small uncer- 
tainty in those parameters, for vanilla models at least). The smoothing scale of the IGM 
is determined mostly by the pressure, i.e., Jeans smoothing, which is determined by the 
temperature history of the gas [42, 43]. A linear theory calculation of the smoothing scale 
for reasonable thermal history, following [42], predicts ~ 100/i _1 kpc or slightly more (this 
is the rms smoothing length for a Gaussian kernel applied to the density field). [39] used 
the hydro-PM (HPM) approximation of [42] to include pressure in the non-linear evolution 
of otherwise PM simulations and found results consistent with smoothing PM by a smaller 
amount, ~ 40 kpc. We know of no reason to doubt the qualitative accuracy of the HPM 
simulations, but ultimately fully hydrodynamic simulations will be required to decisively 
compute the expected value of (3 for a given model. [44] plotted the 3D power spectrum from 
a hydro simulation, and suggested (3 could be ~ 1, but their single 25/i _1 Mpc simulation 
box had too few modes in the comfortably linear regime to assess the accuracy of this result. 
Finally, even if we were sure of our calculation for a given thermal history, there is some un- 
certainty due to uncertainty about the thermal history, especially the redshift of reionization 
(we have not quantified how large this uncertainty is). 

At the percent level needed to interpret ID power spectrum measurements, the value of 
the bias parameter b is well known to depend on the mean absorption level (shown directly 
in [39]), which is difficult to determine accurately; however, at the level of this paper it 
is actually quite precisely predicted, with [39] and [41] (verified by Martin White, private 
communication) agreeing on a value ~ 0.13 to ~ 10% with less sensitivity to smoothing 
than for (3 ([40] did find a higher value ~ 0.18 for their much lower 500/i _1 kpc resolution 
simulation) . 

In addition to quasar pair measurements, there have been some studies of closely spaced 
groupings of quasars that provide hints of three-dimensional structure in the Lyman-a forest 
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Figure 1. Survey area of the quasars used in this paper in equatorial coordinates, in the Aitoff 
projection. 

[45-53]. However, large scale flux correlations are weak, so detecting them with high signifi- 
cance across widely separated sightlines requires a large and dense quasar sample. By design, 
the BOSS quasar survey provides precisely such a sample, probing a large comoving volume 
with ~ 15 sightlines per deg 2 . 

The ultimate goal of the BOSS survey is to measure the baryon acoustic oscillation 
(BAO) feature to high precision in the galaxy redshift survey at z < 0.7 and Lyman-a forest 
at z ~ 2.5 [24] (the possibility of measuring the BAO feature using the Lyman-a forest was 
suggested in [39] and some preliminary calculations were done by [54] , but the potential of the 
measurement was not really quantified until [55] (see also [40, 41, 56]). These measurements 
will be used as a standard ruler to measure the angular diameter distance Da(z) and Hubble 
parameter H(z). The first proof-of-concept paper for the galaxy part of the survey is [57], and 
this paper attempts to do the same for the Lyman-a forest part of the survey. The sample 
of 14,000 quasars analyzed in this paper is too small to yield a confident detection of the 
BAO feature, but it does allow the first precise measurements of three-dimensional structure 
in the Lyman-a forest. The basic statistic that we use is the flux correlation function 

Z F (r,fi) = (5 F ( X )5 F ( X + r)) , (1.3) 

which is the Fourier transform of the power spectrum (1.2). Our measurements provide novel 
tests of the basic cosmological understanding of high-redshift structure and the nature of the 
Lyman-a forest. 

The next section describes our data sample, based on quasars observed by BOSS be- 
tween December 2009 and July 2010. During these first few months of BOSS, which included 
commissioning of new hardware, software, and observing procedures, data taking was rela- 
tively inefficient. Our sample of 14,598 z > 2.1 quasars over ~ 880 square degrees is less 
than 10% of the anticipated final sample of 150,000 quasars selected over 10,000 deg 2 , but it 
is already comparable to the total number of z > 2.1 quasars found by SDSS-I and II [58]. 
Section 3 describes our methods for creating realistic synthetic data sets, which are crucial 
to testing our measurement and error estimation procedures and which we use as a source of 
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Figure 2. The geometry of the BOSS survey for a thin slice in the equatorial plane. Our galaxy 
is at the origin. The dark dots are the galaxies measured in the BOSS survey and the blue markers 
show the positions of quasars whose Lyman-a forests we use. The actual Lyman-a forest regions are 
shown in as red lines. Apparent differences between geometry of quasar and galaxy distribution arise 
from small differences in slice thickness and time-span. 



theoretical predictions for comparison to our measurements. Section 4 describes our analysis 
procedures including removal of continuum, estimation of the flux correlation function and 
its statistical errors, and model fitting, presenting detailed tests of these procedures on the 
synthetic data. Section 5 presents the main results of the BOSS analysis. We describe sev- 
eral internal tests for systematic errors in §6, and we summarize our results in §7 with final 
remarks given §8. 

2 Data sample 

In this section we give a brief overview of the BOSS Quasar data sample that was used in our 
analysis, and the French Participation Group (FPG) visual inspections of the BOSS spectra. 
The quasar target selection for the first year of BOSS observations is described in detail in 
[59], which draws together the various methods presented in [60, 61] and [62]. In summary, 
selecting z > 2.1, and in particular 2.1 < z < 3.5 objects, has always been challenging due 
to the colours of quasars at these redshifts crossing over the stellar locus in, e.g., SDSS ugriz 
optical colour-space [63]. Therefore, the "UV Excess" method, generally used for z < 2 
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Figure 3. Redshift distribution of quasars for the sample used in this analysis (left panel), and 
weighted distribution of Lyman-a forest pixel redshifts for the three redshift bins considered in this 
paper (right panel). The quasar redshifts are cut to be between 2.1 < z < 3.5. The pixel weights are 
limited by the UV coverage of the spectroscope at low redshift end and by the redshift of quasars at 
high redshift. We show the sample of quasars without DLA flag; however, the two plots are virtually 
identical for the sample in which DLA flagged quasars are included. 

objects, fails. As such, the 2.2 < z < 3.5 redshift range was sparsely sampled in the original 
SDSS, and the number density of z > 3.5 objects was very low (~ few deg~ 2 ). 

In BOSS, selecting 2.2 < z < 3.5 is key to meeting our science requirements. We have 
the advantage that when one selects the quasars only as backlights for the detection of neutral 
hydrogen, one is at liberty to use a combination of methods to select quasars without having 
to understand the completeness of the sample in any variable relevant for studying quasar 
properties or clustering. In total 54,909 spectra were taken by BOSS that were targeted as 
high, z > 2.2, quasar targets between Modified Julian Dates (MJDs) 55176 (11 Dec 2009) 
and 55383 (6 July 2010). From this sample, 52,238 are unique objects, and have SDSS PSF 
magnitudes in the range 18.0 < r < 22.0. Of these 52,238 objects, 13,580 (14,810) quasars 
have z > 2.20 (2.10) and a secure redshift as given by the BOSS spectro-pipeline. 

Although the BOSS spectro-pipeline is secure at the 90-95% level, a non-negligible 
number of objects are classified as QSOs when they are stars and vice-versa. Because of 
this, the nearly 55,000 objects were also visually inspected by one, or more, of the BOSS 
team members. These inspections are generally referred to as the FPG inspections. The 
manual inspections of the BOSS spectra give a further level of confidence in the data, and 
result in a net gain of real high-z quasars that the pipeline does not select, at the level of 
~ 1 — 3 objects per 7 deg 2 plate. Moreover, the visual inspection has the option to manually 
label high-z quasars which have either Damped Lyman-a (DLA) or Broad Absorption Line 
(BAL) features. Currently, this option is binary in the sense that the visual inspector has 
the option of setting the BAL and DLA flags but no attempt is made to measure A/hi 
column density or the position of the absorber. We have checked that the nagged DLAs have 
log N(& i)(cm~ 2 ) >~ 20.3. The list is not complete however, since DLAs are difficult to 
detect along lines of sight of SNR < 4 [64] . We have also checked that the so-called balnicity 
index is larger than 2000 km s _1 (see e.g. [65]). The removal of DLA systems in our data 
hence still depends on a human factor and signal-to-noise ratio and is currently not a very 
well quantified process. This will become better defined in future work. 
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For the analysis presented here, those objects labeled by the FPG visual inspections 
as "BAL" are not used, and we don't include objects labeled as "DLA" unless mentioned 
specifically. 

In our sample we restrict the analysis to quasars from the FPG sample with 2.1 < z < 
3.5. Compared to the targeted science sample described above, this sample has no magnitude 
cut, expands the lower redshift to z = 2.1 and uses some quasars that are on plates nominally 
marked as "bad" by the pipeline, but contain useful data based on manual inspections. After 
removing approximately 1300 quasars marked as BALs, our complete sample contains 14,598 
quasars, of which 13,743 are not labeled as having a DLA. Several quasars in the first year 
sample had repeated observations - we do not use these repeats and limit ourselves to the best 
quality observation (as determined by FPG). The FPG visual inspections are continuously 
being updated and revised. We froze the data used on 1 st October 2010. 

Further details about the target selection and details about the instrument and the 
observational procedure can be found in [59]. 

Figure 1 shows the position of the BOSS Year One quasars on the sky in equatorial 
projections in the Aitoff projection. Figure 2 shows the geometry of BOSS probes of large 
scale structure in comoving coordinates. Figure 3 gives the redshift distribution of the z > 
2.1 quasars and the weighted histogram of pixel redshifts contributing to our correlation 
measurement (explained and discussed later), as a function of redshift. 

3 Synthetic data 

To test our data reduction pipeline and to examine the impact of various systematics, we 
created extensive synthetic datasets. These synthetic data shared the geometry with the 
observed set of quasars, i.e. the positions and redshifts of quasars were identical, and we 
used the observed SDSS g magnitudes to normalize the spectra. 

The statistical properties of synthetic data matched our initial estimate of the properties 
of the underlying flux field, including the non-linear corrections to the linear redshift-space 
distortions [39]. 

The Lyman-a absorption spectra of each quasar were generated using a method de- 
scribed below to obtain the variable 5f = F/F — 1, where F is the fraction of transmitted 
flux, at each spectral pixel from a realization of the underlying cosmological density fluctua- 
tions, including linear redshift-space distortions and non-linear corrections to the flux power 
spectrum model that is used as input. A total of 30 different realizations of this underlying 
cosmological field were generated. For each realization, the following sets of synthetic data 
were created, with increasing level of realism to evaluate the impact of various observational 
effects and of our data reduction procedure: 

• Noiseless measurements of dp 

• Noiseless measurements of Lyman-a forest with continua 

• Noisy measurements of Lyman-a forest with continua 

• Noisy measurements of Lyman-a forest with continua and high-absorption systems 

• Noisy measurements of Lyman-a forest with continua and forest metal contamination 

• Noisy measurements of Lyman-a forest with continua and forest metal contamination 
and high-absorption systems 
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Note that continua here means variable continua using randomly generated PCA eigen- 
mode amplitudes [66], instead of the same mean continuum for every QSO. 

In the following subsections we briefly describe how these synthetic data are actually 
generated. The generation of Lyman-a absorption and associated structure will be discussed 
in more detail in [67]. 

3.1 Absorption field 

3.1.1 Generation of a Gaussian field 

A Lyman-a forest survey samples the intrinsically three-dimensional flux transmission frac- 
tion field along a set of infinitesimally thin lines of sight to quasars. The brute force method 
for generating a survey with correct correlations would be to generate the full 3D field with 
sufficient resolution (tens of kpc), and then read off the density along lines of sight. For a 
survey as large as ours, this requires too much memory. Fortunately, when generating a Gaus- 
sian field we can greatly simplify the problem, maintaining exact high-resolution statistics 
while only creating the field for the limited fraction of the volume where it is required. 

To generate a Gaussian random field for a general set of pixels with covariance matrix 
C we can Cholesky decompose the covariance matrix, i.e. first find L such that 

L L T = C. (3.1) 

If we then generate a unit variance white noise field Wj in the pixels and multiply it by L^j, 
the resulting field will have the desired correlation function, i.e. 

(SiSj) = (LikWkLjiwi) = LikLfzj = Cij. (3.2) 

Assuming the lines of sight are parallel allows another significant simplification. Suppose 
the field S (x\i,x±j has power spectrum P (ku, k±). If we Fourier transform this field in the 
radial direction only, the resulting modes have correlation 

(<5(A;||,x i )5(A ; f | ,x ± / ))=27r^(A ; ||+A ; f | )p x (A ; ||,|x ± -x i , |) , (3.3) 

where 

1 f°° 

Px {k h r ± ) = — kdk J {k ± r ± ) P{k,n k ) , (3.4) 

where k± = w k 2 — k 2 and ^ = k\\/k. The key point is that modes with different values 

of fcii are uncorrelated. We can generate the field efficiently by following the above general 
procedure for generating a correlated Gaussian random field, except now independently for 
every value of k\\ required for the radial Fourier transform. We never need to manipulate a 
matrix larger than a manageable N q x N q , where N q is the number of quasars. However, we 
do take into account the fact that lines of sight are not fully parallel as discussed in Section 
3.1.3. 

We use this procedure to generate any desired Gaussian field 5 at each pixel of each 
quasar spectrum of the synthetic data sets, once we have a model for the power spectrum 
P(k,fik) of this variable. 
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3.1.2 Generation of the transmitted flux fraction, F 

In order to obtain realistic synthetic data including noise as in the observed spectra, it is 
necessary to generate the flux transmission fraction F with a realistic distribution in the 
range to 1, and add the noise to this flux variable. The solution we have adopted is to 
convert the Gaussian fields 5 to a new field F(5) with any desired distribution. In this paper 
we use a log-normal distribution in the optical depth, r = — log F (where log denotes natural 
logarithm), which implies the following probability distribution for F: 

tF\J 2-ko t 

with the two parameters ro and a T that determine the mean and dispersion of F . We find 
the function F{8) that results in this distribution for F when 5 is a Gaussian variable. The 
correlation function of the variables 5 and F are then related by the following equation: 



Sf(tvl) = (FiF 2 ) 

-1 rl 

dF x / dF 2 p(F 1 ,F 2 )F 1 F 2 
) Jo 

/oo 
d5 2P (5 1 ,5 2 )F 1 F 2 
-oo 



g|+£|-2iij2jVl2) 

= / d6t / d5 2 —-===^F{5 l )F{5 2 ) . (3.6) 

We note that this equation relates £p to £, independently of the variables r, /i and z. 
Therefore, we simply tabulate £f(£)> an d i nver t the relation to figure out the correlation 
function £ that is required in order to obtain any desired correlation function £p. 

We use a model for the flux power spectrum in redshift space, Pjr(fc, fi^), which was fitted 
to the results of simulations in [39]. We use the value of /3 for the central model in that paper, 
f3 = 1.58, and a bias inspired by the central model but roughly adjusted for cosmological 
model (not using the parameter dependence Table in [39]), b = 0.145, at z = 2.25. When 
generating the synthetic data, we keep the value of /3 constant as a function of redshift, and 
vary the amplitude of the power spectrum according to a power-law, Pp oc (1 + z) a . We 
compute the correlation function £p from the Fourier transform, then we calculate £ for the 
Gaussian field, and we Fourier transform back to obtain the desired power spectrum for 5. 
We generate this Gaussian field in all the quasar spectra with the method described above, 
and then transform this to the variable 5f- In this way we obtain a set of spectra with the 
desired flux distribution and redshift-space power spectrum. 

We use mean transmission fraction approximately matching the observations [8]: In (F) (z) 
ln(0.8)[(l + z)/3.25] 3 - 2 . 

3.1.3 Redshift evolution and non-parallel lines of sight 

The field 6p is generated as described at a fixed redshift, and assuming parallel lines of sight. 
This is done at four different values of the redshift, by generating the realizations of the field 
with fixed random numbers for all the Fourier modes, and changing the amplitudes according 
to the different power spectra at every redshift. The power spectrum varies both because of 
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the intrinsic evolution, and because the angular diameter distance used to convert angular 
to physical separation between the quasars varies with redshift. In this way, the fact that 
the lines of sight are non-parallel is incorporated into the calculation in the same way as 
the redshift evolution of the power spectrum. The final field <5p is obtained by interpolating 
between the fields that have been generated at different redshifts, to the exact redshift of 
each point on the line of sight. This field then has the desired local power spectrum at any 
redshift, evolving as it is prescribed according to the linear interpolation. 

The redshift evolution of amplitude assumed in this work is described by a power law 
(1 + z) a , with a = 3.9 as suggested by the measured evolution of the ID flux power spectrum 
[9]. 

3.2 Adding absorption systems: Lyman Limit Systems (LLS) and Damped 
Lyman-a (DLA) systems 

The highest column density absorption systems produce broad damped absorption wings 
which can have a strong impact on the correlation function of the transmitted flux. These 
systems are traditionally known as Damped Lyman Alpha absorbers (DLA) and have H I 
column densities above 10 20 ' 3 cm -2 , however, the damping wings can affect the line profile at 
lower column densities as well. Systems with column densities above 10 17 ' 2 cm -2 are known 
as Lyman Limit Systems (LLS) since they are self-shielded [68]. At 10 172 cm~ 2 , the effect of 
damping wings on the profile is small, but it becomes significant well before 10 20 ' 3 cm~ 2 [69]. 

The impact of these absorption systems is two- fold. First, they add noise to the mea- 
surement of the correlation function. Second, the systems trace the underlying mass density 
field, and therefore they are correlated with themselves and with the Lyman-a absorption 
arising from the intergalactic medium. This systematically modifies the overall Lyman-a 
transmission correlation function. 

To simulate the effect of these systems in the synthetic data, we introduce lines with 
A^hi > 10 17 ' 2 cm 2 with an abundance consistent with observational constraints [70-72], using 
the formula of [69]. The decrease in (F) (z = 2.25) due to these systems is 0.014. We also 
introduce a correlation between these systems and the rest of the Lyman-a absorption by 
placing them only in regions where the optical depth is above a critical value To, such that 
the probability of r > ro is 1%. This is performed to explore the way that the damped 
absorbers may bias the measured correlation function, but their detailed impact depends 
on the specifics of this correlation. We leave for a future analysis a better modeling of the 
cross-correlation of damped absorbers and the Lyman-a forest. 

In the rest of the paper we refer to these contaminants as LLS/DLA. 

3.3 Adding forest metal contamination 

Absorption by intergalactic metals imprints correlated signal in quasar spectra on charac- 
teristic scales. These scales are set by the wavelength ratios of metal line transitions with 
Lyman-a and with each other. As a result, this correlated signal is a potential contaminant 
of large-scale structure measurements in the forest. In order to add forest metal absorption 
to the synthetic data we measure the strength of metals. We do this in a self-consistent 
manner by measuring metal line absorption in the continuum normalized BOSS Year One 
spectra (excluding spectra with known DLAs). We use a modified version of the method 
set out by [73] and measure these lines by stacking absorption lines between 104lA-1185A 
in the quasar rest-frame. We measure the signal associated with metal lines correlated with 
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Lyman-a or Si III in the forest. This is described in Appendix C, including a look-up table 
of flux decrements measured in the composite spectra. 

We introduce these metal absorption features to the full suite of mock data with no 
noise and no LLS/DLAs, on a pixel-by-pixel basis; we walk through the spectra and lower 
the transmitted flux by values from interpolation of this table, and then add Gaussian noise. 
As a result, we lower the mean transmitted flux by 0.003 in the mocks. This approach assumes 
that metal lines trace Lyman-a structure monotonically, and as such these ID correlations 
will add metal structure to our 3D analysis. Full line profiles are recovered by virtue of our 
easing of the local minimum requirement (metal absorption is added at the wings of Lyman-a 
profiles as well as the center). 

This technique does not provide a full description of metal absorption and, in particular, 
neglects the impact of scatter in metallicity and metal complexes. We test these mocks by 
stacking them in the same manner used for the BOSS Year One spectra. The metal corre- 
lations imprinted in the noise-free mocks are reasonably well recovered in these composite 
spectra. Composite spectra of the mocks with noise added (after metals have been intro- 
duced) show metal correlations that, where measurable, are up to 10% weaker than those 
seen in the observed data or added to the mocks in all cases except the Si III line seen in the 
strongest bin which is 30% weaker. This is caused by a combination of Gaussian noise and 
the probability distribution function of the flux. The noise distribution is symmetrical but, 
in the relevant regime, there are always far more pixels in the higher flux bin, which have 
weaker associated metal absorption. We conclude that metals added are probably an under- 
estimate of the average metal absorption associated with Lyman-a lines, but these results 
are sufficient for an exploration of the approximate impact of forest metals. It seems unlikely 
that LLS /DLA interlopers are able to produce the observed metal signal for the reasons given 
in [73], however, we shall explore this issue further in future publications. We also combine 
forest metals with LLS/DLA corrected mocks by introducing these high column lines after 
forest metals have been added. Hence only metals associated with the Lyman-a forest are 
included. 

3.4 Generating the spectra 

Once we have created an absorption field for every line of sight, we proceed to generate the 
actual spectrum for each quasar, multiplying it by the "continuum" of the quasar, i.e. the 
unabsorbed spectrum. 

We generate each quasar continuum shape taking into account the quasar redshift and 
using a mean rest-frame continuum and random PCA components derived from the low- 
redshift Hubble data [66]. The continuum is then normalized using the g magnitude of the 
quasar (taking into account the Lyman-a forest absorption). 

Since our data are sampled on precisely the same grid as the observed data, we can also 
introduce noise by using the actual values of noise from the observed data. We assume noise 
to be Gaussian with the absolute flux variance given by the pipeline, i.e., we do not correct 
the signal-to-noise ratio for differences between our randomly generated continuum level and 
the data level. 

Because the mocks were generated before the data analysis procedure was finalized, we 
have mocks only for quasars with redshift > 2.2, while the data analysis uses z q > 2.1. 



- 11 - 



4 Data analysis 



In this section we describe the analysis applied to both real and synthetic data. Briefly, 
the steps involved in this analysis start with co-adding the multiple spectra of each quasar. 
We then fit a model for the mean quasar continuum and mean absorption from the whole 
set of spectra. This is used to determine the fluctuation in the fraction of transmitted flux, 
6p, from the observed flux, in each individual spectrum over the Lyman-a forest range. 
The correlation function is then measured from these flux fluctuations. The information on 
the distribution of datapoint pairs and the measured correlations feed into the code that 
estimates the errors on our correlation function. With the estimated correlation function 
and error-covariance matrix, we finally proceed to estimate the parameters in our model of 
the correlation function, in particular the two bias parameters. The next subsections explain 
in detail each of these steps. 

4.1 Preparing the data 

All spectra in this analysis were obtained with the BOSS spectrograph on the Sloan 2.5-meter 
telescope at Apache Point Observatory [15]. The BOSS instrument follows the design of the 
spectrograph previously used in SDSS, but with improved throughput, wider wavelength 
coverage, reduced sky background from smaller fibers and a larger number of fibers. A full 
description of the instrument will be reported in a future paper [74] . 

Observations are performed using 15-minute exposures that are processed immediately 
with a simple data reduction package. These quick reductions provide estimates of data 
quality that are evaluated by a pair of observers at APO to insure roughly uniform depth 
over the full survey. The observers cease exposures when estimates of the accumulated 
SNR exceed a minimum threshold, leading to an average exposure time of approximately 75 
minutes per plate. Within 24 hours of completing a night of observations, the images are 
fully reduced into wavelength-calibrated, flux-calibrated, one dimensional spectra. Adapted 
from the SDSS-I/II spectroscopic reduction pipeline, version v5_4_14 of the BOSS pipeline is 
used to generate the spectra in this analysis. An updated version of the BOSS pipeline [75] 
is used for the first two years of spectra that will be publicly released in July, 2012. 

The modeling of the noise associated with the flux measured at each wavelength is 
particularly important for measurements of correlated Lyman-a absorption. Noise is char- 
acterized for each pixel in the raw images early in the Spectro-2D stage of the BOSS data 
reduction pipeline. After bias removal, the flux in each pixel is converted from analog digi- 
tal units into photon counts using CCD amplifier gains that were determined independently 
during the commissioning of the instrument. We generate a corresponding image to repre- 
sent the variance by taking the sum of square of the readnoise (measured in the overscan 
region of each quadrant) and the gain-corrected flux measured at each pixel for which the 
flux is positive. Pixels that fluctuate to negative flux values are assigned a variance from the 
readnoise only. The relative pixel-to-pixel response in the data is corrected with a flat-field 
image and the variance for each pixel is rescaled accordingly. For each pixel, we then invert 
the variance into units of inverse variance, and set the inverse variance equal to zero for all 
pixels that are contaminated by cosmic rays or known cosmetic defects. 

We transform each two-dimensional image into a set of one-dimensional spectra with an 
optimal extraction [76] . The flat-field exposure is used to determine the shape of the profile 
and the fiber-to- fiber variations in throughput. The arc lamp image is used to determine 
the wavelength solution, with linear corrections made in each exposure to match sky line 
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Parameter 



Value 



Total number of QSOs 14,598 (~ 160,000 in the complete survey) 

QSO sky density 10-25 deg~ 2 (typically ~15 deg -2 ) 

QSO redshift range 2.2 - 3.5 

Spectral resolution ~ 2.5A ~ 2.5/i _1 Mpc 

Pixel scale ~ l.oA ~ 1.0/t _1 Mpc 

Signal to noise ratio per pixel 0.5 - 10 (typically ~ 1) 

Exposure times Until SNR thresholds is reached, typically 5x15 minutes 

g-band magnitude 18-22 (typically ~ 21) 

Table 1. Typical observational parameters for BOSS quasars 



locations. The profile is fit to each spectrum in the two-dimensional image weighted by the 
inverse variance. Each extracted spectrum and corresponding inverse variance vector are 
corrected to ensure a uniform response curve as determined from the fiber-to-fiber variations 
in throughput mentioned above. All corrections to the inverse variance and flux vector 
preserve the SNR for each pixel. Sky subtraction is performed on each individual exposure 
using a model for background that varies with fiber position. The model is determined from 
the one-dimensional spectra extracted from fibers that were intentionally set to regions where 
no objects had been detected in the SDSS imaging program. The inverse variance of all fibers 
is rescaled by a factor to force the error distribution of all sky fibers at each wavelength to 
have a normal distribution with unit variance. In the Lyman-alpha region of the spectra, 
this typically scales the errors by a factor of 1.1, and this scaling is never allowed to decrease 
the errors. This effectively corrects for systematic uncertainties in the sky modeling and 
subtraction. We also mask spectra around bright atmospheric emission lines, such as Hg 
(5460A) and O I (5577A). 

At the final step in the Spectro-2D stage of the BOSS reduction pipeline, the spectra 
derived from individual exposures are combined into a coadded spectrum for each fiber. The 
Spectro-ID pipeline spectroscopically classifies and assigns redshifts to each object based 
upon a minimum chi-squared fit to a library of PCA templates derived from BOSS data. 

In this analysis, we use the spectra output from the Spectro-2D pipeline before the 
coaddition. Instead of using the resampled coadds from Spectro-2D, we co-add the data 
using the same fixed wavelength grid as the pipeline, but combining the flux values of the 
closest pixels in wavelength from the individual exposures. This ensures, in the simplest 
possible way, that the noise of the co-added data is independent from pixel to pixel, at 
the expense of a poor treatment of the small scale fluctuations in the data. Since we are 
interested in large-scale correlations, we are not concerned about these effects. In each pixel 
the data are coadded using inverse variance weighting. We apply a small correction to the 
final variance in each pixel, which typically increases it by less than 10%, to ensure that the 
inter-exposure variance in pixels is consistent with the noise predicted by the pipeline for 
individual exposures. 

Typical values of observational parameters for BOSS quasars used in this work can be 
found in the Table 1. An example of a BOSS spectrum after this reduction is shown in Figure 
4. This spectrum has a somewhat higher than typical SNR of quasars in our sample. 
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4.2 Continuum fitting 

The next step in analyzing the data is to fit the continuum. Our model for the flux measured 
at wavelength A in quasar i at redshift z% is as follows: 

/(A, i) = aj [A r /(1185 k)p C{\ r )F{\) [1 + S F (X, i)] , (4.1) 

where A r = A/(l + Zj) is the rest-frame wavelength. The term C(A r ) denotes the mean rest- 
frame quasar spectrum, which is multiplied by a power law Oj[A/(1185 A)] fc % where and bi 
are two parameters determined for each individual quasar. The power-law is included to allow 
for large-scale spectro-photometric errors that are present due to imperfect sky subtraction 
and calibration, as well as for any intrinsic variation in the quasar spectra. The term F(z) 
describes the mean absorption in the forest as a function of redshift. The entire product 
OjA 6i C(A/(l + z q ))F(z) is a well constrained quantity, but individual terms are correlated. 
For example, we can multiply m and F{z) by a certain factor and absorb it into C(X/(l+z q )); 
this is also true for the power-law parameter 6j. Consequently, some quantities (F, en, etc.) 
are determined only up to an overall normalization constant. 

For the purpose of determining our fit to the mean quasar continuum and mean trans- 
mission, and the parameters and bi, we calculate an overall likelihood of all the data 
without taking into account the correlations of the forest and the continuum fluctuations in 
different spectral pixels. This simplification allows us to write this simple expression for the 
likelihood function, 



log/: = EE 



[/(A,i)-a t [A r /(1185A)]^C(A r )F(z)] 2 ' 
2^X) l0g ^' A) 



+ const.. (4.2) 
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Here, <r(z, A) is the sum of the variances due to measurement noise and the intrinsic variance, 
i.e., it contains both variance due to continuum errors as well a variance due to small scale 
flux fluctuations. This quantity is required to be a function of the observed wavelength only 
in the region of the Lyman-a: forest (or equivalently, the Lyman-a forest redshift), while 
outside the forest we force it to depend only on rest-frame wavelength, which implicitly 
assumes that the dominant source of variation in that region is continuum errors. We do not 
necessarily believe his assumption is true, but this allows in one sense optimal weighting of 
different parts of the quasar spectrum outside the forest, i.e., prevents certain high- variance 
parts of the rest-frame spectrum from excessively affecting the solution. In the long term, 
this assumption will have to be relaxed. 

We restrict the rest-frame wavelength range used for this fit between 1041 A and 1600 A, 
and we also discard the data at an observed wavelength bluer than 3600 A. We parameterize 
C(A r ) using 20 equally-spaced spline nodes inside the forest (1041 A to 1185 A), and 20 
equally-spaced spline nodes outside the forest (1185 A to 1600 A), with one point fixed to the 
center of the Lyman-a emission line. We also use 8 spline nodes in redshift for the mean 
transmission fraction F, another 8 spline nodes to describe the scatter in the forest as a 
function of redshift, and another 20 spline nodes to describe the rms variations with rest- 
frame wavelength outside the forest. The full model used in continuum fitting is therefore 
described by 76 global parameters and 2 parameters for each quasar, Oj and b{. 

Since we are completely ignoring the correlations between pixels, the numerical values 
of (J 2 (i, A) and x 2 = ~ 21og£ have no physical interpretation. They should be thought of as 
merely assisting the fit to naturally incorporate the variances in the data. Therefore, it is 
not crucial that a 2 is parameterized by redshift inside the forest and rest-wavelength outside. 
A possible criticism of our fitting procedure is that redward of the Lyman-alpha emission, 
20 nodes of the spline fit are not enough to resolve the full structure of the mean continuum 
at the resolution of the BOSS spectrograph. However, these points are used only to assist 
with determining the values of a% and b% (the normalization and power law slope, relative to 
a mean continuum), and hence our determination of continuum must be only good enough 
to broadly describe the shape in that region. 

We determine all the parameters using a specially crafted Markov Chain Monte Carlo 
(MCMC, see e.g. [77]) procedure to find the maximum of C. The global parameters are 
first varied, while keeping aj and hi fixed, taking several hundred accepted samples (steps 
in the chain) of the MCMC. Next, the global parameters are fixed and we vary just ctj 
and hi for one quasar at a time, looping over all quasars. At this step, only the model 
for the particular quasar under consideration matters, and hence likelihood evaluations are 
extremely fast. We then go back to varying the global parameters and repeat the process 
for several accepted samples. This is iterated about 50 times (we have tested that using 
just 20 iterations negligibly affects the final results), and we finally take the sample with the 
optimal fit. The proposal function of the MCMC process learns the degeneracy directions 
by using the covariance of previously accepted samples, separately for the global parameters 
and the quasar-specific parameters. This MCMC process is simply used here as a functional 
maximizer, i.e., we take just the most likely sample. The process takes about 36 hours on an 
8-core workstation. 

4.3 Determination of 5f 

The next step is determining the flux variation 5f, reducing the resolution of our data and 
determining the weighting factors that are used for calculating the correlation function. First, 
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the continuum is predicted at each pixel of each quasar spectrum. Then the observed flux 
and continuum are averaged over four consecutive pixels, weighted with the inverse variance. 
This provides coarser pixels and an estimate of the noise on the value of flux in each rebinned 
pixel, with a reduced resolution. The typical length of these rebinned pixels is ~ 3/i _1 Mpc. 
We assume that the total variance in a coarse pixel is the sum of the noise plus the intrinsic 
variance in the flux field, which is a function of redshift. We then iteratively determine: 

• The intrinsic variance in the rebinned pixels, determined as a function of redshift by 
splining over 8 points in redshift. 

• A factor by which we multiply our estimate of the continuum in each quasar, which 
ensures that the flux averaged over the entire Lyman-a forest of the quasar equals 
the mean continuum averaged over the same interval. This average is calculated by 
weighting with the inverse total variance in each pixel. 

We find that by forcing the average mean flux over the quasar forest to match that of the 
average continuum, we decrease the variance in the correlation function estimate, although 
we also erase all modes along the line of sight of wavelength larger than that of the size of 
the forest. The way that this elimination of the mean flux variation in each quasar alters 
the measured correlation function can be easily modeled, as discussed in Appendix A. The 
upshot is that we can correct the predicted correlation function for this effect using a simple 
theory with one parameter Ar, which is the effective Lyman-a forest length. In the limit of 
Ar — > oo, the correction disappears, as expected. The value of Ar can be predicted from 
first principles, but we let it be a nuisance parameter as explained later. 

Finally, we find the average value of the fluctuation 5f at each fixed observed wavelength 
by averaging over all quasars, and we then subtract this average from all the values of 5f in 
each individual quasar. The average is calculated in wavelength intervals AA/A^a = 0.001. 
This procedure, which removes purely radial modes, helps remove some systematics of imper- 
fect sky subtraction or Galactic Call absorption [78] that occur at a fixed wavelength. The 
analysis of synthetic data demonstrates that this negligibly impacts the measured Lyman-a 
correlation properties. Finally, we make a cut on pixels in <5p that are over 6 — a away from 
zero in terms of the total variance. A handful of pixels were removed that way, but this cut 
makes negligible change in the resulting correlation function. 

4.4 Estimation of the correlation function and its errors 

We use the trivial sub-optimal estimator of the correlation function as simply the weighted 
average over pixel pairs, 



where the weights Wi are the total inverse variance (from both the measurement noise as 
well as the intrinsic variance due to small scale fluctuations in the forest). This estimator 
effectively assumes that the covariance matrix of the 5fi values can be approximated as 
diagonal. By construction it is an unbiased estimator, but it is not an optimal one. To avoid 
introducing contamination from the correlated residuals from continuum fitting errors, we 
include only pairs of points from different quasars when measuring the three-dimensional 
correlation function. We also measure the one-dimensional correlation function along each 
quasar, which is, however, only used for the error estimation. 




(4.3) 
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Throughout we neglect signal-noise correlations that arise due to the fact that the 
Poisson noise associated with counting of the photons hitting the CCD is measured from the 
observed photon count rather than the underlying (and unknown) true expectation value of 
photon count. In [9] it was found that this results in 1-2% change in the amplitude of the 
continuum fit, which induces a corresponding 2-4% change in the amplitude of the power 
spectrum, but they find no effect beyond this change in amplitude. The effect is likely to be 
smaller in BOSS data, compared to SDSS data since objects in BOSS are in average fainter 
and hence the amount of (not signal-correlated) Poisson noise coming from sky-subtraction 
is larger. 

Another potentially important source of spurious correlations is correlated noise on dif- 
ferent quasars observed on the same plate from imperfect sky-subtraction. We have found 
that including pairs of pixels from the same observed-frame wavelength produces strong con- 
tamination, which we think comes from residuals of the sky-subtraction. We therefore elimi- 
nate all pairs of pixels with an observed wavelength that are within 1.5 A of each other. The 
resulting correlation function is well described by the three-parameter linear cosmological 
model and thus we believe that we have managed to remove majority of the contaminat- 
ing signal. By applying this cut and ignoring pairs in the same quasar with thus discard 
information in purely transverse {fjL = 0) and purely radial (fi = 1) modes. 

We measure the correlation function ^_p(r, \i, z) in 12 radial bins up to r = 100/i -1 Mpc, 
10 angular bins equally spaced in fi = cos#, where 9 is the angle from the line of sight, 
and 3 redshift bins [z < 2.2, 2.2 < z < 2.4, z > 2.4). The redshift bins correspond to the 
redshift of the absorbing gas rather that the redshift of the background quasars backlighting 
the gas. Together with estimating the correlation function in individual bins, we have also 
calculated the weighted averages of r, \x and z for each bin (i.e., weighted in the same way as 
the £ measurement in that bin), which are used in the subsequent analyses. These averages 
correspond to the true bin position, although we find the difference with respect to the 
nominal bin centre to be small. When estimating the correlation function in limited redshift 
bins, we take into consideration all pairs whose mean redshift falls in the redshift bin under 
consideration. 

It can be shown that the error covariance matrix of this estimator is given by (see 
Appendix B): 

^ Epairs ijeA.pairs k,ldB w i w j w kWl(CFik^Fjl + ^Fil^Fjk) 

^AB = ^ ^ , (4.4) 

Z^pairs iJeA w i w 3 2->pairs k,l£B w kWl 

where A and B represent two bins in r, [i and z of the correlation function measurement and 
dj,obs denotes the actual observed covariance between the data points i and j . We stress that 
£,ij,obs is obtained from the data, so the contribution from noise and continuum fitting errors, 
metal absorption or any other possible systematic effects is automatically included (note that 
we do use overall measurements of the correlation function, including even the noise in an 
averaged sense, not pixel-by-pixel noise, however, the covariance is not dominated by pixel 
self-products so this detail probably is not important). 

Strictly speaking, this estimator for errors is true only at the level of 2-point contribution 
to the error covariance matrix; however, we show using synthetic data that it accurately 
reproduces the errors (although our mocks do not contain as much non-Gaussianity as we 
expect from the real data). 

Evaluating the sum in the numerator of equation 4.4 is a computationally daunting 
task: one would need to make a sum over all possible pairs of pairs, which is an 0{N^) 
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task for iV datapoints compared to the 0(N 2 ) for the estimation of the correlation function 
itself. Since in our case N ~ 10 5 , the extra effort is O(10 10 ) computationally more expensive 
when compared to estimation of the correlation function. However, only a small fraction of 
all possible 4-point configurations add significantly to the sum, namely those configurations 
of points (i,j,k,l) for which (i,k) and are close together (so that the corresponding 
values of £p are large), and for which the distance between the pairs (i, j) and (k, I) is in the 
relevant range over which we want to estimate the correlation function. We therefore use the 
following Monte-Carlo procedure: 

1. For each quasar, identify the list of quasars that are closer than the largest distance 
scale on which we are attempting to measure the correlation function (100/i _1 Mpc). 
We denote such quasar pairs neighbors. We additionally identify a subset of neighbours 
which are at distances closer than r cn = 30fo -1 Mpc. Such quasar pairs are denoted 
close neighbors. 

2. Select a random two pixels in the dataset, corresponding to neighbouring quasars A 
and B . These two points constitute a pair which is held fixed while we loop over 
all possible pairs of points (k, I). Pairs (k, I) are chosen from all possible pixels in close 
neighbors of A and close neighbours of B. For each such quadruplet and for the two 
possible pairing of points, determine which covariance matrix element they belong to 
and add to the corresponding element according to the Equation (4.4). 

3. Repeat the step 2 for Nmc times and then multiply the final sum with the ratio of all 
possible pairs (i, j) to the actual number of pairs considered JVmC' 

4. Divide this sum by the appropriate sum of weights for each covariance matrix bin. 

This process converges for about Nmc > 10 7 , in the sense that the resulting x 2 values 
change by less than unity. The reason why this Monte-Carlo procedure works is fairly in- 
tuitive: quasars are distributed randomly and hence there are no special points that would 
anomalously contribute to the variance of our estimator. We must sample the geometry of 
the distribution of our points well enough to cover all typical configurations, and by that 
time the error estimation converges. It is also important to note that our error estimation 
is complete in the sense that it is based on the measured two-point statistics in the field. 
For example, the continuum errors add a large scale contaminating signal along each quasar. 
These errors manifest themselves as larger correlations in the auto-correlation function of the 
quasar and ultimately result in larger errors on our measurements of the three-dimensional 
correlation function. 

Finally, the speed of this error estimation is drastically affected by the distance chose 
to denote close neighbors r cn . The method approximates that the contribution to the er- 
ror covariance from correlations beyond this distance is negligible. The speed for con- 
verges grows approximately quadratically with r cn . We have tested the procedure with 
r cn = 10, 20, 30, 50/i _1 Mpc and noted that the results converge at 20/i~ 1 Mpc. The 

convergence has been established by inspecting the best fit x 2 when fitting bias/beta param- 
eters (see section 4.5) and demanding that it changes by less than unity. We have therefore 
chosen r cn = 30/i~ 1 Mpc. 

1 Since quasars have varying number of pixels, this is not identical to selecting random neighbouring quasars 
A and B and then random pixels in these quasars. 
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Figure 5. The result of the \ 2 distribution exercise for radial bins at r > 10/i _1 Mpc, which is used 
in our cosmological fitting. The thick black histogram correspond to Monte-Carlos with realizations, 
while the thick red histogram is the distribution for the actual 30 realizations of synthetic data. The 
thin histograms of the same colors shows the effect of ignoring off-diagonal elements of the covariance 
matrix. The number of bins here is 300 (10 x 10 x 3). See text for discussion. 



We tested this procedure using synthetic data as follows. We averaged 30 synthetic 
realizations of the full dataset (with noise and continuum) into one single mean measurement, 
which was assumed to be the true theory. For each realization we calculated the x 2 ■, resulting 
in 30 x 2 values. Since we did not input the true theory, but the mean of thirty synthetic 
datasets, the mean x 2 , for ./V correlation function bins, is expected to be (x 2 ) = (1 — 1/M) N, 
where M = 30 is the number of datasets being averaged over. In practice, we Monte-Carlo 
this distribution by drawing random vectors from the same covariance matrix in a set of 30. 
The purpose of performing error estimation test this way is that it allows us to disentangle the 
accuracy of error-estimation from systematic effects that might affect the theory predictions 
(for example, when continuum fitting can add small systematic shifts in the mean predicted 
signal). Figure 5 presents the results of this exercise on the synthetic data with continuum 
and noise and shows that the error estimation is robust. 

We plot the structure of the error covariance in Figure 6 which shows that the neighbor- 
ing ji bins are somewhat correlated, but the neighboring r bins much less so. The neighboring 
redshift bins (not plotted) are even less correlated (at around 1%). Finally, we note the neg- 
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Figure 6. The structure of the error- covariance. Off-diagonal terms for the correlation matrix are 
plotted for one particular bin (denoted with a red point) on the plane of r and fi bins at the middle 
rcdshift bin. The covariance at the bin itself is unity, but we do not plot it to avoid unnecessarily 
stretching the color scale. The lower left corner is not present in the data due to a cut on observed- 
frame AA. Values of /z bins are linearly spaced between 0.05 (first bin) and 0.95 (tenth bin). The 
first four radial bins correspond to distances of 3/i _1 Mpc, 8/i _1 Mpc, 12/i _1 Mpc and 17/i Mpc. The 
remaining 8 radial bins are uniformly spaced between 25/i _1 Mpc and OS/i^Mpc. The reference point 
is 25/i- 1 Mpc, n = 0.55. 



ative correlation at high-//, large r arising due to removal of the mean in each quasar. 

The x 2 distribution test in Figure 5 shows that the error calculation is broadly correct, 
but is not very sensitive to inaccuracy in an important subset of the error matrix, and does 
not test for systematic inaccuracy in the measurement. To address these points we also show 
the result of another test of the error matrix. In the next section we discuss how we fit 
the bias parameters, and in the results section we will quote the tail probability that f3 is 
larger than a certain number based on MCMC chains. We test this procedure on our 30 
mocks by precisely the same test, namely fitting the bias parameters and then measuring 
the probability that the value of /3 is larger than the fiducial value used in creating the 
synthetic datasets. If our process is unbiased and correct, then we expect this probability to 
be uniformly distributed between zero and one. We plot the results of this test in Figure 7. 
Results of this exercise are quite encouraging - despite evidence for small systematic shifts 
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p(P > 1.58) 



Figure 7. Cumulative distribution of the fraction of samples with value of j3 above the fiducial 
value when fitting synthetic data. For correctly estimated errors, ignoring the upper prior on /3, this 
distribution should be flat between zero and unity, giving a straight line for cumulative distribution, 
plotted as thick red. Thin blue line is the same when off-diagonal elements in the covariance matrix 
are ignored. Faint grey lines are fOO realizations of 30 points drawn from uniform. 

in the inferred parameters (Table 2), we seem be able to successfully recover limits in the (5 
parameter for confidence limits used in this paper. 

Note that Figure 7 is a test both of the errors (whether the covariance matrix is correct, 
and whether the likelihood is fully described by a covariance matrix) and any systematic 
offset in the measurement. There are a few things that are known to be imperfect about 
our analysis, although not expected to be important: We make no attempt to include non- 
Gaussianity of the measured field in the covariance matrix estimation, i.e., to include the 
connected part of the 4-point function. The mocks are in any case only a weak test of 
this assumption, as they are not designed to have the correct 4-point function. On the 
other hand, [9] found that the errors in the ID power spectrum were within ~ 10% of the 
Gaussian expectation, and we are probing larger scales here, where we expect less non- 
Gaussianity. There are technical issues that could add up to biases in the pipeline. For 
example, the measured correlation function, which is used to estimate the errors, is assumed 
to be constant within radial bins, which is a poor approximation, especially for the most 
important bins at small separations. It is also true that the process of adding noise and fitting 
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and dividing by continuum could potentially "renormalize" the large-scale bias parameters. 
Careful exploration of these effects will be left for the future work. This will inevitably 
require many more mocks in order to clearly differentiate between biased error estimation 
and realization variance. In any case, Figure 7 shows that many possible forms of error in 
the analysis are unimportant. 

4.5 Fitting bias parameters 

We have fitted the bias parameters of the real and synthetic data assuming a fiducial flat cos- 
mology with n m = 0.27, o- 8 = 0.8, n s = 0.96, H = 71 km s^ 1 Mpc" 1 = 100 h km s" 1 Mpc" 1 , 
fib = 0.04, (where symbols have their standard meaning). We assume that the correlation 
function of 8 F is linearly biased with respect to the underlying dark- matter correlation func- 
tion, and that it follows the linear theory redshift-space distortions. We do not use the 
measured correlation at r < 10/i _1 Mpc for the fits where linear theory becomes a poorer ap- 
proximation. After Fourier transforming Equation (1.2), one derives the following expression 
for the redshift-space correlation function: 

£ F (r,(,)= ]T b 2 C e m Fe (r)P e (v) , (4.5) 

^=0,2,4 

where Pi are Legendre polynomials, 

C = 1 + 2/3/3 + 1/5/3 2 , (4.6) 

C 2 = 4/3/3 + 4/7/3 2 , (4.7) 

C 4 = 8/35/3 2 , (4.8) 

and 

£ Ft {r) = (2tt)- 3 J P(k)k e (kr)d 3 k , (4.9) 

with the kernels ki(x) given by 

ko(x) = sin(x)/x , (4-10) 
k2(x) = (sin(rr)x 2 — 3sin(x) + 3cos(x)x)/x 3 , (4-11) 
k<i(x) = (x 4 sin(x) — 45x 2 sin(x) + 105sin(x) + 10x 3 cos(x) — 105xcos(x))/x 5 , (4.12) 

and P(k) is the linear real-space power spectrum. 

We model the redshift evolution of the power spectrum as a simple power-law, 

P F (k, z) = b 2 P(k, z = 2.25) (j-^g) ■ (4-13) 

Therefore, the growth factor never enters our analysis - the redshift evolution is parametrised 
purely as power-law deviation from the power at z = 2.25. The only exception to this rule is 
when we fit the actual bias parameters in the three redshift bins independently (see Figure 
20), where bias parameters are measured in individual redshift bins with respect to the matter 
power at that redshift (but are assumed constant across a redshift bin). 

Whenever we fit for these parameters, we also include three Ar parameters describing 
the effect of removing the mean quasar component (as derived in Appendix A) at three red- 
shifts, as free nuisance parameters. Although these parameters can be determined a-priori, 
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we have found that fitting for them is easier, accurate enough (as tested on mocks), and 
worsens the errorbars on the inferred bias parameters by a very small amount when we 
marginalize over them. The resulting chains constrained values of Ar very weakly to be in 
the range between ~ 200/i~ 1 Mpc and 500/t -1 Mpc. 

We fit using an MCMC procedure. We put a flat prior on < j3 < 5 and unconstrained 
flat prior on 6(1 + (3), which are well-defined non-degenerate parameters. (Note that /3 can 
in general be less than zero, however, it is difficult to imagine a scenario in which this would 
happen.) This implies a non-flat prior on b. We also assume a flat prior on a. 

When we fit the data, we always evaluate the theory at each bin's mean redshift, [i value 
and radius. This approach allows for any linear variations of the underlying quantity across 
the bin. This is not a good approximation at the lowest values of r, but, as we show later, it 
is a good approximation for the bins that we actually use when fitting the bias parameters. 

4.6 Pipeline tests and synthetic data 

Our code has been extensively tested on the synthetic data. These tests have served two goals. 
First, they helped remove coding errors in both the data-reduction code and the synthetic 
data making codes. For this purpose, the two codes were written completely separately 
by two people. Second, these tests established the amount of systematic effects introduced 
by the data-reduction process. Our guiding principle in this exploratory work was that we 
should only correct for the errors that affect our results at the level of the current statistical 
errors in the experiment. Namely, as long as we are able to reproduce the input parameters 
from our synthetic data with the highest level of realism and noise at acceptable x 2 values, 
we pronounce the data reduction code sufficient and apply the same method to the observed 
data. More data might require better codes. 

While not always strictly enforced, we did attempt to follow the principles of blind 
analysis: after our basic reduction code was in place and gave no visually surprising results 
on the observed data, we worked exclusively on synthetic data in order to understand the 
reduction procedure and returned to observed data only when we were able to reproduce the 
input parameters from the synthetic data. (The analysis procedure was not, however, frozen 
at this point, as our mocks were not realistic enough to expect every problem to show up in 
them.) 

We begin by discussing the noiseless synthetic measurements of 5f- While these data 
are noiseless and there is no error associated with the continuum fitting, the final result still 
contains variance associated with sample variance. 

We have analyzed these datasets using three methods: i) by calculating the correlation 
function, ii) by first removing the mean component of each spectrum and then calculating 
the correlation function and iii) by also removing the purely radial modes as described in 
Section 4.3. The result of ii) and iii) are practically identical. We show the results of this 
exercise in Figure 8, where we have averaged the individual results of all 30 realizations in 
order to make this an essentially statistical error-free measurement, at least when compared 
to the statistical error in the real data. The purpose of this figure is essentially two-fold: to 
illustrate that, at the level of noiseless synthetic data, we can reproduce the theory that is 
being put into the simulations, and, more importantly, that the simple model in Appendix 
A appears to produce fairly good results (we will quantify these later). 

In order to proceed we need to understand the effect of adding various complexities 
to the data and how they affect the results. It is very difficult to judge the size of various 
effects by visually assessing how the points move on the plots, so we adopted the following 
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Figure 8. The correlation function plotted as a function of fi 2 for selected radial and redshift bins. 
Points are averages of 30 noiseless synthetic datasets with perfectly known continuum without any 
processing (red) and with mean and radial model removal (green). Red lines are the theory used to 
produce synthetic data, while green lines is the same after we corrected for the effect of mean removal 
using equations in Appendix A with Ar of 250, 300 and 350 h~ 1 Mpc for the three redshift bins. 

procedure. We run the parameter estimation of the synthetic data with the covariance matrix 
corresponding to one mock noisy dataset, but with the data vector corresponding to mean 
over 30 mock datasets. During the fitting procedure this yields x 2 values which are too 
small, but it allows one to see how central values move with respect to the size of the error 
bars. Note that our 3-parameter model (the bias b, the redshift-space distortion /3, and the 
redshift-evolution index a) is essentially the same model that has been used to create mocks, 
but with the important difference that the latter contains corrections to the linear theory 
redshift-space formula arising from matter power spectrum non-linearities, scale-dependence 
of the effective bias and fingers-of-God effect. 

In Figure 9 we show the results of the test described above, when fitting with the data 
corresponding to r > 20/i _1 Mpc. Figure 10 contains the same material but for the sample 
for r > 10/i -1 Mpc. We show the same information also in the tabular form in table 2. It is 
important to note that the inferred parameter constraints come with the usual caveats about 
Bayesian inference (projections and priors) and therefore we also quote the best-fit model, 
which should be at the position of the true theory, if we had infinite number of synthetic 
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Figure 9. Results of fitting the data averaged over 30 mock datasets together with noise covariance 
for a single noisy realization and using only datapoints with r > 20/i _1 Mpc in the fit. We show 
constraints on the b — ft plane and the probability histogram of a (which has negligible degeneracy 
with the other parameters). The input points are denoted by the red dot and the red line. The upper 
left plot is for the pure synthetic noiseless 6f values. The upper right plot is for synthetic data that 
have PCA continua and noise. The lower left plot is for the data that in addition to PCA continua are 
additionally painted with high column-density systems. The bottom right panel is for synthetic data 
to which metals have been added as described in §3.3 (with noise and continua but no DLA/LSS). 

datasets. 

The noise and continuum fitting do not strongly affect the inferred bias parameters. It 
is important to stress that this is a non-trivial conclusion: the continua in the synthetic data 
were simulated using PCA eigen-components, while the fitting procedure assumed a much 
simpler model for the continuum variations. This test also confirms the validity of the results 
of the Appendix A for the present purpose. There is some evidence that the continuum 
fitting may systematically decrease the bias, but the effect is smaller than the errorbars for 
one set of mocks. However, we find that the presence of the high column-density systems 
increases the expected bias and lowers the inferred value of f3. This is not an artifact of the 
pipeline, but simply the result of change in the theory. Forest metals have the same effect, 
but to a somewhat lesser extent. High column-density systems and forest metals definitely 
warrant further investigation, but this is beyond the scope of this article, and here we just 
note that these two effects can affect the inferred values of bias, /3, and, to a lesser extent, a. 
The inferred parameters are skewed (at a 0.5 — a level) when fitting with linear models using 
r > 10/i — 1 Mpc data due to incorrect assumption of fully linear redshift-space distortions, 
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Figure 10. Same as figure 9 but for fitting all data with r > lOh 1 Mpc. The axis limits were kept 
the same for easier comparison. 

especially at large values of fi. 

Finally, we note the surprisingly high degree of degeneracy in the 6-/3 plane. This 
degeneracy is present in all measurements of /3 (see e.g. [79]), but the high values of (3 
of the Lyman-a forest compared to typical galaxy populations makes it particularly acute. 
At P = 1.5, the power spectrum of purely radial modes has (1 + 1.5) 2 = 6.25 times more 
power than purely transverse modes and hence is measured much more precisely. If one 
measures just radial modes, the data would exhibit a perfect b{l + (3) degeneracy, which is 
only relatively weakly broken by the measurements of the low fi modes. 

5 Results with the observed data 

In this section we discuss results derived from the observed data. Figure 11 illustrates, 
qualitatively, our 3-d measurement of structure traced by the Lyman-a forest, and it gives 
an idea of the relevant scales. The inset panel shows the distribution of BOSS quasars in a 
patch of sky 140' x 70', corresponding to 170 h^ 1 Mpc x 85 h^ 1 Mpc (comoving) at z = 2.5 for 
our adopted cosmological model. The main panel plots the spectra of those quasars marked 
in the inset panel by circles; we have omitted other quasars in the field (shown by x's in 
the inset) to preserve clarity. In the main panel, observed wavelengths have been converted 
to redshifts for the Lyman-a transition and from redshift to comoving distance using the 
standard cosmology. Note the radical difference in scale of the two axes in the main panel: 
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Table 2. Results of parameter fittings for the average of 30 synthetic datasets and noise matrix for 
a single noisy measurement. The input values are b — 0.145, = 1.58, b(l + (3) = 0.375 and a = 3.8. 
Errorbars are 1 — a confidence limits except for j3 where we give the 1, 2 and 3 a error bars. 



the Lyman-a forest of a typical spectrum (highlighted in red) spans ~ 400/i~ 1 Mpc, while 
typical transverse separations of neighbouring quasars are a few h~ l Mpc up to ~ 20 h^ 1 Mpc. 

Next we illustrate our continuum fitting process in Figure 12 for the observed data and 
the synthetic data. The left-hand panel shows the mean continuum, while the right hand 
panel shows the mean absorption up to an arbitrary scaling factor (since it is completely 
degenerate with unabsorbed continuum level). It is important to stress that no information 
from the actual measured data went into this first generation of synthetic data. The continua 
in the synthetic data were created from the low-redshift Hubble Space Telescope observations 
and the agreement between real and synthetic data attests to the universality of the quasar 
mean continuum. We do not plot the measurement errors here, since the errors on the final 
correlation function are derived using the measured correlations in the data, and these include 
all the extra variance due to continuum errors. 

We illustrate the distribution of parameters ai and bi for real and synthetic data in 
Figure 13. The distribution for the observed data is considerably wider than that for the 
synthetic data. This is most likely due to the presence of the spectro-photometric errors in 
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Figure 11. Illustration of 3-d Lyman-a forest measurement. Inset panel shows the distribution of 
BOSS quasars in a patch of sky 170/i _1 Mpc x 85/i _1 Mpc (comoving size at z = 2.5). The main 
panel plots spectra of the ten quasars marked by filled circles in the inset. Other quasars are included 
in the analysis but omitted here for clarity. Wavelengths have been converted to equivalent comoving 
distance for Lyman-a, and the Lyman-a forest regions are highlighted in red. Note that the scales for 
the vertical and horizontal axes are much different - in fact the lines of sight are much closer together, 
relative to their length, than they appear here. 

the data, but one should not exclude, in principle, a wider intrinsic variation in the shape 
of the quasar continua. However, we have shown that fitting out these parameters does not 
affect the derived correlation function. 

In Figure 14 we plot the variance per pixel of Sp, after the observational noise has been 
subtracted. To translate this quantity into a theory observable, one must take into account 
the effects of the pixel size, the point spread function of the telescope, and the variance 
introduced by the continuum fitting. We see that at relatively high redshift the theoretical 
expectation obtained from the synthetic data sets agrees within ~ 10%, although there is 
an unexpected upward trend in the residual variance at redshifts below 2.4. This will be 
further discussed in §7. The solid red curve describes the data with our standard continuum 
determination, while the dotted curve shows the results of using an alternative continuum 
fitting method that will be described elsewhere [80]. 

Finally we proceed to the actual results on the correlation function. Figure 15 displays 
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Figure 12. The mean continuum fits (left) and mean absorption fits (right), in arbitrary units, after 
thirty iterations of the continuum fitting pipeline for both the observed data (blue) and synthetic 
data (red). The curves diverge for z < 2 and z > 3 as there are virtually no data in those regions (see 
right panel of Figure 3). 
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Figure 13. Distribution of parameters di and bi for the data (red) and synthetic data (blue). 

the measured correlation function as a function of radius up to 100/i -1 Mpc, together with 
the best fit theory, after averaging over angle (upper plot), and also on a two-dimensional 
grid of transverse (r±) and radial (ry) separation (two lower plots). These plots demonstrate 
the main two results of this paper: we have detected correlations in Lyman-a: absorption 
in our 3-dimensional measurement out to comoving separations of 60 /i -1 Mpc, and we have 
detected strong redshift-space anisotropy in the flux correlation function. Both results are 
established at high statistical significance, and our measurements are consistent with the 
predictions of a standard ACDM cosmological large-scale structure model augmented by a 
well motivated 3-parameter description of the relation between the Lyman-a forest and the 
underlying dark matter distribution. The parameter values — describing linear bias, linear 
redshift-space distortion, and redshift evolution — are in accord with a priori theoretical 
expectations, once we account for the impact of high column density absorbers and metal 
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Figure 14. The remaining variance per pixel in the forest as a function of redshift after the 
observational noise has been subtracted. The thick black line is the value of the default synthetic 
data, while the blue line is the synthetic data with LLS/DLA added and the green line is with the 
forest metals added. The thick red line is for the data, and the dotted red line is for an alternative 
continuum fit to the data [80]. 

lines. 

We note also that our parameter errors are substantially larger for measurements on 
the real data relative to the mocks (not counting j3, where the errors are highly dependent on 
the central value). Because the errors depend entirely on the measured correlation function 
in each case, this implies a substantially different correlation function between the two cases. 
We have not investigated carefully, but we suspect the difference is related to the large 
variance at low z in the real data (Figure 14), which is apparently not entirely compensated 
by boosted large-scale signal. 

In Figure 16 we show the actual measured data-points of £j?(r,[j,,z) in 30 panels for 
each bin in [i and z, together with the best-fit theory to guide the eye. We plot the square 
root of the diagonal elements of the covariance matrix as error bars, but measurements are 
correlated and therefore one should not attempt to evaluate x 2 by eye. The points in this 
figure are our actual measurements used in the fitting of the bias parameters. For easier 
visualization of the results, we also convert them to a few alternative forms. In Figure 17 we 
average over redshift bins and radial bins in some cases and plot the same datapoints as a 
function of /i. 

In Figure 18 we convert the ten fj, measurements, averaged over redshift, into measure- 
ments of the multipoles of the correlation function. We perform the same operation for the 
best-fit theory. Our results are in striking agreement with the predictions of the standard 
linear (i.e., extended Kaiser) theory: we see a strong monopole correlation function that 
is proportional to the linear theory matter correlation function and a strong detection of a 
negative quadrupole, which is the signature of linear theory large-scale redshift-space dis- 
tortions. (Note that the mean removal effect and the uneven redshift of individual points 
create a small i = 6 moment for the theoretical predictions, even though the i = 6 moment 
is exactly zero in pure linear theory.) 

These results confirm that we have detected correlations in the Lyman-a forest flux 
in three dimensions out to a much larger scale than in previous measurements, and that 
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Figure 15. Primary measurement results in visual form. Top plot shows the monopole of the 
correlation function, together with a best-fit two-parameter (6, ($) linear model. The bottom two 
plots are redshift averaged data plotted in the plane r± — rn, with each pixel plotted with the value 
corresponding to the nearest neighbor. The left panel corresponds to data, and the right panel to the 
corresponding best-fit theory. 
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we have detected the linear redshift distortions for the first time in the Lyman-a forest. 
This demonstrates that, on the large scales in which these linear correlations are measured, 
the dominant source of the Lyman-a forest transmission variations arise from gravitational 
instability of primordial mass fluctuations. 

To put this claim on a more quantitative basis, we fit the bias parameters as described 
in section 4.5. The results are plotted in Figure 19 and given in Table 3. The best-fit x 2 
values are 233 with 237 degrees of freedom when fitting with points r > 20/i -1 Mpc and 281 
with 297 degrees of freedom when using points with r > 10/i _1 Mpc. 

Compared to the simulations of reference [39], the data prefer lower values of (3 and 
higher values of bias, although the product b(l + 0) is of the right magnitude. We also see a 
somewhat lower evolution with redshift. The formal probability that /3>1.5is~3x 1CP 3 
(~ 1 x 10~ 3 for r > 10/i _1 Mpc fitting; both cases probably dominated by the MCMC sampling 
noise), and /3 > 1.0 about 10% (24% for r > 10/i _1 Mpc fitting). For fitting each redshift bin 
individually, the probabilities are 11%, 24% and 48% for j3 > 1.5 and 33%, 48% and 65% for 
(3 > 1.0 for the lowest, medium and highest redshift bin respectively. We see that the low 
value (3 for a single j3 fit is driven a lot by the lowest redshift bin. Our synthetic data sets 
show that metals and high column density systems (LLS/DLAs) can considerably lower the 
observed value of (3. Clearly, more work will be required to explain in detail the values of 
bias parameters that we find. 

To determine the distance to which we have formally detected correlations, we calculate 
the value of x 2 f° r a model with no correlations and compare it to the x 2 °f the three 
parameter model. Using this criterion, we have detected correlations up to a distances 
60/i _1 Mpc < r < lOO/i^Mpc at 3-a (Ax 2 > 9), and up to 70/i" 1 Mpc < r < WOh^Mpc at 
2-0" (A% 2 > 4). Similarly, we have detected redshift-space distortions with high significance. 
For r > 20/i _1 Mpc, setting (3 = while varying other parameters results in Ax 2 ~ 48, while 
using distance r > 10/i _1 Mpc one gets Ax 2 ~ 120. These large Ax 2 values imply very 
high statistical confidence. We quote this numbers as "over 5cr" to conservatively take into 
account potential imperfections in the error matrix. 

6 Systematic effects, cross-checks and data-splits 

In the previous section we have shown results of fitting the bias parameters to the measured 
correlation function. In this section we investigate how stable these results are against various 
divisions of the data. We always perform two basic cuts on the data: we never use pairs 
of pixels from the same quasar, and we never use pairs of pixels that are separated by less 
than 1.5 A. In addition, we have performed several tests by splitting the data in a variety 
of ways. Generally, for each split we measure the bias parameters and compare these to the 
full dataset. If the combined data were inconsistent with a subset of the data, this might 
indicate some unknown systematic error. For example, if what we are measuring are truly 
cosmological fluctuations, our results should be the same regardless of using bright or faint 
quasars. We perform the following data splits: 

• Perform the fitting excluding the lowest [x or the highest /i bin. In both cases, the 
central values shift and error bars increase, but the overall fit is not driven by either 
low and high /x values alone. 
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0.164 ±0.047 



r > 20/t- 1 
H > 0. 
n < 0. 



n ei _i_0.54 1.64 3.33 

u -° i=c 0.30 0.49 0.65 

n Q7_|_0.82 2.55 3.86 

u - y ' =c 0.39 0.62 0.78 

1 oq i 1.56 3.21 3.65 



0.305 ±0.017 

0.330 ±0.019 
0.383 ±0.033 



Mpc, 

1 

9 



0.200 ±0.021 
0.206 ±0.023 



n rq-l0.19 0.45 0.78 
U.DOiC 15 o.27 0.38 
n fit: i 0.21 0.48 0.80 
U - DO=C 0.16 0.28 0.38 



0.325 ±0.013 
0.341 ±0.013 



1.57 ± 1.77 
1.33 ± 1.60 



r > 20/j- 1 
104lA< A rest 
1120A< A rest 



Mpc, 

< 1120A 

< 1185A 



0.207 ±0.050 
0.118 ±0.032 



r > 20/i~ 1 Mpc, 
g < 20.5 
g > 20.5 



n Qft-l-0.30 0.96 2.66 
U.OOrE .i8 0.30 0.35 
1 co_i_1.54 2.98 3.35 



0.285 ±0.054 
0.311 ±0.024 



-9.31 ± 13.87 
3.40 ±2.91 



0.137 ±0.035 
0.162 ±0.035 



1 m-l-i-O 1 2 -94 3.82 
l.J.urc 45 o.7i o.90 

1 on_|_0.99 2.78 3.72 
i - zu=c 0.45 0.71 0.89 



0.295 ± 0.042 
0.357 ±0.022 



-9.10 ±7.55 
2.64 ±2.36 



Table 3. This table shows the results of the parameter fittings for the data and various systematics 
checks. All error bars are 1 — a error bars, except for the /3 parameter in which case we give 1,2 and 
3 — a confidence limits. 



• Perform the fitting as a function of separation. We fit for bias-(3 parameters in 5 radial 
bins. Results of this test are plotted in Figure 21. The value of a was fixed at the 
fiducial value of a = 3.9, but a is not degenerate with other parameters. 

• Perform the fitting as a function of redshift. We fit for the bias and f3 parameters 
independently in each redshift bin. Results of this test are plotted in Figure 20. 

• Split by Lyman-a forest range. We divided the total data into two segments 104lA — 
1120A and 1120A — 1185A. Note that each of these sets contain only a quarter of 
the original pairs (the remaining half being in the cross-correlations between the two 
halves). The measurement of a is very poor using the short wavelength end because 
there is a dearth of pixels at relatively high redshift in this case. 

• Split by quasar brightness. We divided by g-band magnitude into quasars brighter or 
fainter than 20.5. Note that each of these sets contain only a quarter of the original 
pairs. The measurement of a is very poor using the bright quasars because there is a 
dearth of pixels at relatively high redshift in this case. 

None of these tests resulted in a detection of a significant systematic effect. Results of 
this exercise are also presented in Table 3. Note, however, that these splits are not entirely 
satisfactory, as the differences are hard to test at precision approaching the overall precision. 
E.g., in the case where [i is restricted to be > 0.1, 6(1 + (3) changes from 0.336 ± 0.012 to 
0.325±0.013. This seems like a small change, but on the other hand it is almost 1 a when only 
a small fraction of the data is removed. If we think of fi > 0.1 and fx < 0.1 as two independent 
measurements of 6(1 + j3), the implied difference between the values is 0.074 ± 0.034, i.e., a 
2.2 a difference. There is no compelling reason to believe this is evidence for a systematic 
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error (this was chosen out of several possibilities as an example of a big difference), but it 
highlights the fact that a split can differ by ~ 6 a by the scale of the full-sample errors (i.e., 
0.074/0.012) yet still not be decisive evidence for a systematic problem, i.e., big effects could 
hide in this kind of test. In the cases where the sample of quasars is split, the errors often 
expand catastrophically due to a reduction in cross-correlation pairs, so in the end the tests 
are not actually very powerful. In the future, we may try to squeeze out more powerful tests 
by, e.g., fitting for some simple parameterized dependence of b and (3 on quasar magnitude 
(or other properties), continuing to use the full data set in the fit. 

7 Discussion 

We measure the redshift-space distortion parameter (3 to be between 0.44 and 1.2 at the 
95% confidence level. This value is lower than the theoretical prediction from numerical 
simulations of the Lyman-a forest in [39], (3 ~ 1.47, with small error bars for the particular 
model that was analyzed there. We have shown here that forest metal contamination and 
LLS/DLAs may help explain this discrepancy: in the simple model we have adopted to 
include LLS /DLAs in our synthetic data, these systems alone lower the fitted value of j3 from 
1.58 to P ~ 1.33, while forest metals alone lower it to /3 ~ 1.35. The two effects together can 
lower it to f3 ~ 1.09 (we used /3 = 1.58 to start here, instead of 1.47, because we had not yet 
adjusted the prediction of [39] to reflect a modern cosmological model). Note that we have 
removed by-eye-identified DLAs from the data set, so any effect must be coming from ones 
that are missed, due to low column density and/or noise in the data. 

When we include identified DLAs in the analysis (Table 3), b and ft change in the direc- 
tion predicted by the mocks, but it is actually quite difficult to estimate whether this change 
is consistent with a small or large fraction of the systems relevant to b and f3 being included 
in this identified sample. Since optical depth is additive, the observed flux fluctuations are 
5f(x) = (1 + 5f(x))(1 + 5d(x)) where 6p(x) is low density forest and Sd(x) high density. 
At this point one might be tempted to approximate this as 5f(x) ~ 5f{x) + Sd(x), which 
makes it straightforward to estimate b and /3 for one of the fields given the other two. This 
would make it possible to interpret the mocks as predicting b and (3 for the DLA/LLSs, and 
then estimate exactly how, given that model, the observations with DLA/LLSs should be 
related to ones without them. Unfortunately, as shown by [81, 82], this linearized calcula- 
tion is mathematically nonsensical. A careful study of a term like (5f(x)5d(x)5f(ii)) shows 
that this is not generically smaller than {5D(x)SF(y)) - a naive Taylor series approach is 
not valid because the product 5f(x)5d(x) applies locally, i.e., at a small-scale point where 
fluctuations are not small. Pursuing this calculation to next-to-leading order shows that 
gravitational evolution generically leads to a modulation of this local product by large scale 
density modes, so the composite field 5f(x)6jj(x) appears as a standard linearly biased field 
on large scales. 

Somewhat puzzlingly, adding real DLAs increased our a result substantially (Table 3), 
while the mocks actually predict a reduction. Clearly there are some imperfections in our 
treatment, although sub-DLAs which are included in the mocks but not identified in real 
data could in principle account for these oddities. We emphasize here that the degree to 
which LLS/DLA and metal lines may lower the value of (3 ought to depend on the way in 
which these systems are inserted in our synthetic data sets, and on the way they are cross- 
correlated with the Lyman-a forest. The model we have adopted to insert the DLA/LLS 
systems in our mock spectra should only be considered as an illustrative example of their 
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plausible effect. An observationally calibrated physical model of the distribution of these 
systems will be required before reliable predictions can be made of their impact on the value 
of p. 

As discussed in the introduction, simulations with lower resolution/smoothing scale 
than [39] (including [39] 's own low resolution simulations) find considerably lower value of 
(3 ~ 1.0 ([40], Martin White private communication), so it is possible that a low f3 that 
survives other explanations is an indication of smoother-than-expected small-scale gas, or a 
flaw in the HPM modeling of pressure used in [39] . 

The second parameter that we measure is the bias. Bias is, of course, completely 
degenerate with the assumed value of erg, which at the moment is known to about 3% in the 
simplest LCDM model [83]. The parameter that we are really measuring from the Lyman-a 
forest observations is the product ba&{z ~ 2.25). We assumed a value of cr&{z = 0) = 0.8 and 
our inferred bias varies as the inverse of ag(z = 2.25) (where we mean as loosely - really, the 
linear power on the scale that we are measuring). Our bias is constrained to be between 0.16 
and 0.24, a value which is considerably higher than the b ~ 0.13 obtained in [39] and [41] (the 
latter verified by later simulations of Martin White with higher resolution). [40] did obtain a 
higher value b ~ 0.18 from very low resolution simulations, but the numerical smoothing in 
these simulations is almost certainly much larger than any physical smoothing of the IGM. 
These theoretical numbers are for the uncontaminated forest; metal contamination associated 
with forest absorption negligibly affects bias, but LLS/DLAs can raise the bias considerably, 
by some 20% in synthetic data. The effect of including or excluding quasars marked as DLAs 
is about 10% on the bias in our data. This is consistent with the fact that our non-DLA 
flagged sample is still likely to be contaminated with high column density systems at some 
level (see §2). Regardless of the cause, the higher-than-expected bias may improve BAO 
detection by creating a higher-than- forecast signal. 

There is some evidence that our lowest redshift bin is the most problematic. From 
Figure 20 it is clear that it contributes a lot to the overall signal on (3 and that it drives 
the low value of /3 observed here. Moreover, Figure 14 indicates a greater variance than 
expected at the lowest values of z (this expectation is an extrapolation of the trend in the 
SDSS observational measurement of [9], not necessarily a simulation prediction). The cause 
of this enhanced variance is not yet understood. The bias evolution parameter a is similarly 
smaller than expected, indicating extra power at low z on large scales, although this is only 
significant enough to be suggestive. At current sensitivity, the data are consistent with a 
constant f3 and bias, which suggests that any contaminant that is affecting the large-scale 
correlation is itself a tracer of the large scale structure. 

Irrespective of these uncertainties on the values of b and f3 and their implications for the 
physics of the Lyman-a forest, which will need to be further investigated in the future, one 
main conclusion stands out from this work: the correlation function of the Lyman-a forest 
on the scales of 10 to 60/i -1 Mpc bears the signature of redshift distortions consistently with 
the growth of linear density perturbations by gravitational instability. The detailed physics 
of the Lyman-a forest may still be influenced by other processes on smaller scales, such as 
galactic winds and outflows from quasar jets, but on the large scales examined here, linear 
gravitational evolution must be the principal process at work. 

What are the main improvements that are desirable for the next iteration of the analysis 
of the BOSS Lyman-a forest data? We have reasons to believe that our analysis is sufficient 
for the goals in this paper, mainly because of the good fit we obtain to the standard theoretical 
model and the tests performed in §6. However, the next iteration may require a more 
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sophisticated analysis, especially to better understand the lowest redshift bin. First, we need 
to remove the potential sources of systematics by subtracting the 'signal' measured red-ward 
of the Lyman-a emission line. This should eliminate potential contamination from low- 
redshift metal lines as well as any systematics associated with, e.g., imperfect sky subtraction. 
With the present quasar sample we have performed this test; however the sample used here 
did not include low- 2; quasars that would be required to do this subtraction in the lowest 
redshift bin. The two higher redshift bins did not show any measurable deviation from 
zero. Note, however, that such methods will not solve the problem of metals lines that arise 
exclusively (or almost exclusively) in the forest, as described in § 3.3. Second, the continuum 
fitting could perhaps be improved by going beyond a fixed continuum model (e.g. [80]) and 
more thoroughly investigating the spectro-photometric errors in the data. Third, our results 
have shown that there is a real need to better understand and filter out the high column 
density and metal-line systems systems present in the data. These should be identified, 
possibly using the absorption features red-ward of the Lyman-a emission line, and either 
corrected or removed from the data. Only the approximate impact of forest metals has 
been explored here. As the survey grows, measurements of metal clustering and scatter in 
absorption strength will be included in the analysis. This will also provide greater precision 
and, perhaps, sensitivity to weaker metal lines. We have not explored in this paper the impact 
of metals associated with LLS/DLAs, and this is something we intend to address. Fourth, 
additional physical effects might complicate the biasing of the observed field with respect to 
the dark matter, such as temperature and ionization fluctuations in the intergalactic medium 
[44] (although see [84] for a possible way to constrain these from the 1-dimensional Lyman-a 
forest). These do not affect the measurement per se, but they do affect its interpretation. 
Finally, the method for calculating the correlation function (or power spectrum) should be 
made more optimal. This should include a better treatment of the evolution of the flux 
across the forest and an appropriate form of inverse variance weighting. The full problem 
is computationally intractable, but one could apply the inverse covariance weighting on a 
per quasar basis, an approximately optimal weighting suggested in [55, 56], or do the full 
problem on coarse pixels. 

One of the main ultimate goals of the measurement of the Lyman-a forest correlation 
function is to infer the angular diameter distance Da{z) and the Hubble constant H[z) at 
the observed redshifts, using the position of the BAO peak. However, even at the smaller 
scales at which we have made our measurements here, cosmological constraints might be 
obtainable on the product Da(z)H(z) through the Alcock-Paczyhski method [85-87]. To 
illustrate the statistical potential of this test, we have attempted to fit the observed data 
assuming an Einstein-de-Sitter cosmology. If we simply rescale the radial and transverse 
distances, keeping a constant form for the linear theory power spectrum, spurious higher 
multipoles appear in the redshift-space correlation function. This results in the best fit x 2 
degrading by A% 2 ~ 10 when using only r > 20/i~ 1 Mpc, and by Ax 2 ~ 21 when using 
r > 10/i _1 Mpc. This procedure is not a fully geometrical form of the Alcock-Paczyhski test, 
since we have assumed that the real space correlation function has the shape predicted by 
ACDM, but it shows that our data already have enough power to detect any large deviations 
from the spacetime metric of a flat, A-dominated universe. 

A much stronger change appears if, in addition to changing distances, we also change 
the underlying theory. The CDM correlation function for f2 m = 1 passes through zero in 
the radial direction at ~ 28/i~ 1 Mpc, clearly at odds with our data (see e.g. Figure 15). The 
best-fit x 2 h 1 this case increases by Ax 2 ~ 48 when fitting r > 20/i~ 1 Mpc and by Ax 2 ~ 88 
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when fitting with r > lO/i Mpc. We caution that this procedure has not been tested 
with synthetic data; fitting cosmological parameters goes beyond the scope of this paper. 
This shows, however, that the shape of the correlation function may contain substantial 
cosmological information in addition to the BAO feature if systematic errors can be well 
controlled. 

8 Conclusions and Prospects 

For more than a decade, 1-dimensional analyses of the Lyman-a forest have provided a 
powerful quantitative tool for probing structure in the high-redshift universe. The BOSS 
quasar sample makes it possible, for the first time, to treat large-scale structure in the 
Lyman-a forest as a truly 3-dimensional phenomenon. Although this first-year BOSS quasar 
sample is only 10% of the anticipated final sample, it is already several times larger than 
the largest previous sample used for cosmological analysis of the Lyman-a forest [9]. It is 
similar in size to the entire sample of z > 2.1 quasars from SDSS-I and SDSS-II [58], and the 
order-of-magnitude higher surface density of BOSS quasars makes it a much more powerful 
sample for 3-dimensional measurements. We have achieved high-significance detection of 
the angle- averaged flux correlation function out to comoving separation of 60 h^ 1 Mpc, and 
the shape of this correlation function agrees well with the predictions of a standard ACDM 
cosmological model. 

Our measurements show the clear signature of redshift-space anisotropy induced by 
large-scale peculiar velocities. The agreement of the observed anisotropy with the linear 
theory prediction of the extended Kaiser model confirms the standard model of the Lyman- 
a forest as structure that originates in the gravitational instability of primordial density 
fluctuations [30-33, 88]. 

We have fit our measurements with a 3-parameter model that describes the linear bias 
of the forest (b), the redshift-space distortion (/3), and the redshift-evolution of the corre- 
lation amplitude (a). Our estimated parameter values are within the range of theoretical 
predictions, though the value of /3 appears somewhat low (see §7). Our synthetic data tests 
suggest that this low /3 may be a consequence of high column density systems (LLS/DLA) 
and metal- line absorption within the forest. Statistical errors estimated internally from the 
data agree well with external estimates based on the synthetic data sets, which suggests 
that we have identified any observational or physical effects that have a large impact on our 
measurements. 

The tests in §7 show that assuming either an f2 m = 1 spacetime metric or an f2 m = 1 
CDM matter power spectrum leads to substantially worse agreement with our measurements. 
However, we have not attempted to derive cosmological parameter constraints, instead fitting 
values of b, f3, and a assuming an underlying ACDM cosmology. Previous studies using the 
1-dimensional flux power spectrum have inferred the slope and amplitude of the matter 
power spectrum by using cosmological simulations to predict the bias of the flux power 
spectrum (including its scale dependence) from first principles. Even after marginalizing 
over uncertainties in the IGM equation of state, these studies yield valuable cosmological 
constraints (e.g., [4-7, 18, 20] 

The BOSS Lyman-a forest measurements will allow these tests to become much more 
powerful. The measurement of the 1-dimensional power spectrum will itself become much 
more precise with the large BOSS data sample, and division of the data set into many sub- 
samples of redshift, data quality, and quasar properties will allow careful cross-checks for 
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systematic errors. Much stronger constraints can be obtained from three-dimensional mea- 
surements, because of the additional information contained in the cross-correlation of parallel 
lines of sight and because they allow for strong tests based on redshift space distortions and 
the cosmological dependence of the angular diameter distance and expansion rate. Fully ex- 
ploiting these data will require considerable analysis of the systematic effects of the DLA/LLS 
and metal-line absorption and of additional physical effects on the correlation function, such 
as those due to variations in the ionizing background or in the temperature-density relation 
induced by helium reionization [44]. However, BOSS data will provide many measurements 
with which to constrain these models and test for observational or theoretical systematics. 

The design goal of the BOSS quasar survey is to measure the angular diameter distance 
Da{z) and Hubble parameter H(z) at z ~ 2.5 from the BAO feature in Lyman-a forest 
clustering [24]. Forecasts using the formalism of [55] imply la constraints of 7.7% and 
3.0% on these two quantities, respectively, from the full survey. These errors are strongly 
correlated (similar to the 6-/3 degeneracy found in this paper), so it is more meaningful to 
quote the forecast error on an overall distance scale dilation factor, which is 1.9%. Our 
present measurement of clustering on sub-BAO scales is based on 10% of the full BOSS 
data sample and on first-pass versions of the spectroscopic reduction pipeline and Lyman-a 
forest analysis procedures. The good agreement that we find with theoretical expectations 
reinforces the promise of the Lyman-a forest as a tool to map the high-redshift universe, to 
measure its expansion via BAO, and to thereby constrain the origin of cosmic acceleration. 
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A Appendix: Removing the mean of the forest 



Consider a toy model in which a quasar has a constant continuum and we measure flux in 
pixels i = 1 . . . N: 

fi = f(l + 6i + €i), (A.l) 

where <5j is the underlying fluctuation field and our measurement error (we can rescale it 
by / without loss of generality). By fitting a continuum to the set of points and estimating 
the flux contrast, we actually estimate 8i as: 



s , = fi ~ Ek fi = Si + et-N-^^Sk + ek 



« \ Si + q - iV- 1 + (l - N- 1 ^(5 fe + . (A.2) 

In the approximation, we have used the fact that the mean fluctuation across the forest 
is much less than unity. Taking expectation value over noise, one gets 



= 5i- n- 1 J2 s k- tiN- 1 Y, 6k + N ~ 2 £ s * s i - N ~ 1(j i + N ~ 2 E °l ( A - 3 ) 

where we assumed diagonal noise vector (e^e;) = SEa^ (neglecting the terms from the de- 
nominator above). 

This means that, after fitting for mean continuum, the estimator is not unbiased any- 
more and can, in principle, lead to change in large-scale bias 2 . This effect is the 1-dimensional 
analogue of the integral constraint for the correlation function that is a direct consequence 
of k = mode being forced to be zero. Our pipeline forces not only the overall mean of fluc- 
tuations to be zero, but also mean in each individual quasars to be zero, effectively imposing 
an integral constraint on the per-quasar as well as per-survey basis. 

We proceed to look at cross-correlation between two adjacent quasars with respective 
flux contrast measurements 5' A and 5' B which we, for simplicity, assume to have forests of 
equal length. Then, the trivial correlation function estimator gives 



$ A OHW>-^£(WO+W °- ^relators (A.4) 

k k I 

(neglecting the terms from the denominator above). 

Therefore for a given pair of pixels, the process of removing the mean component from 
the quasar results in measuring the true correlation function minus the appropriately averaged 
correlation function averaged over pixel pairs in the quasar. 

In practice, one does not need to simulate the full geometry of the survey to calculate 
this effect; it is sufficient (as proven by tests on synthetic data) to assume that a typical 
correlation function is averaged over some distance Ar in the positions of both quasars: 

2 Note that this calculation is incomplete as we are missing the third-order terms which contribute at the 
same order when computing a correlation function. 
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^ rAr/2 j /> r-Ar/2 

^F( r ±) r ||) = ^F(»*±,r||)-2— / dri£ F (rx,r\\+ri)+-j—2 dr 1 dr 2 ^F(r±,rn+ri-r 2 ) 

L\r J-Ar/2 Ar JJ-Ar/2 

(A.5) 

where note that the ry inside the integrals is not derived from Eq. A. 4 - we are approximating 
the distribution of relative quasar redshifts by assuming that all quasars are at the same 
redshift in the ru = case, and then assuming that the slightly different weightings of 
alignments for non-zero rii lead effectively to a shift in alignment by exactly ru . 

We use this simple formula to account for the overall affect of removing means from all 
spectra, with a single fitted Ar in each redshift bin, even though generally we could do a 
more careful spectrum- by-spectrum calculation using the pixel-pair weights, because mocks 
show that this approximation is good enough. 



B Appendix: Errors of the trivial estimator 

In this work we use the trivial correlation function estimator (we drop the subindex F in the 
flux correlation function in this Appendix to reduce clutter): 

^ ) = Tn#«U*>*°i*f**U % (R1) 

2-/ pairs i,j w i w j 

It is clear that the expectation value of this estimator is the true correlation function, 
regardless of which weights Wi are used: 

Iti W ^pairs i,j w i w j \$fi$fj) , x x , zJpairs i,j w i w j , . , , 

Z^pairs i,j w i w j Z^pairs i,j w i w j 

(where the correlation function is assumed to be constant within a bin). In other words, the 
estimator is un-biased. The co-variance of this estimator is given by (we use A and B to 
denote two (r, pt) pairs): 

S pairs i,j£A w i w j ZJ pairs i,j€B w i w j 

The 4-point term in brackets can be expanded using Wick's theorem, which yields three 
terms, the first of which is separable, giving 

Z^pairs i,jeA w i w j Z^pairs i,j£B w i w j 

and hence the covariance matrix of errors is given by 

n Itl A\i(T>\\ tt WtlT>\ ^pairs j,jeA,pairs k,l£B W i W j W k w l(CikCjl + ZilZjk) 

Cab (Z(A)Z(B)) - ^{A)C(B) = — ^ — ^ — ^ ^ . (B.5) 

Z^pairs i,jeA w i w j Zvpairs i,j€B w i w j 

We use the raw measurement of the correlation function from the data for £ here, including 
the noise contribution to the diagonal (in an averaged sense, not pixel-by-pixel). Strictly 
speaking this estimator is true only for Gaussian fields and the corrections to it are of the 
order of bispectrum. 
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C Appendix: Measuring forest metal absorption 

A modified version of the approach described by [73] is used to measure metal absorption 
associated with the Lyman-a forest. We eliminate the requirement that stacked pixels be a 
local flux minimum, which was intended to limit the stacking of wings from stronger lines. 
This is preferable because our goal is not to measure metallicity, but to measure a signal in 
order to reproduce it. We combine the individual spectra using the arithmetic mean with 
a 3% outlier clipping and continuum fit to correct for uncorrelated absorption giving us a 
composite transmitted flux, F c , of stacked systems. 

We select pixels to stack by virtue of their normalized flux F n = F/F, where the mean 
transmitted flux, F, is determined using the method set out in § 4.2. Seven composite spectra 
were produced with F n < 0.4 (above which the metal signal was negligible). We retain 
the requirement from [73] that pixels be 0.5a from saturation (a standard choice in pixel 
optical depth techniques) in order to obtain a clean measure and minimize the contribution 
of LLS/DLAs. In each of our composite spectra, we measure F c at line center for 7 metal 
transitions: Si II (1193A), Si III (1207A), N V (1239, 1242A), Si II (1260A), O I (1302A), 
Si II (1304A). Since we do not set a requirement that all stacked absorption be Lyman-a , 
we allow that some lines in the composite arise from stacking metal lines. The only resolved 
contamination of this sort arises from stacking Si III lines. We have measured F c at line 
center for 7 wavelengths where this 'shadow' signal would be present (see [73] for details). 
We include this shadow signal in the construction of mocks as if it came from, e.g., an 
additional metal line at ~ 1225A, even though this implies that the original underlying mock 
field represents not just a hydrogen field, but also an identical Si III field unphysically offset 
by ~ 20 h~ l Mpc in the radial direction. We don't think this affects our results. 

This process provides a look-up table of 7 Lyman-a line strengths and normalized flux 
decrements {D c = 1 — F c ) at 14 fixed spectral locations. This is shown in Table 4. It should 
be noted that we do not limit ourselves to high-significance lines (as in dedicated metal line 
studies). Instead we include all metal lines that may introduce contamination, and are seen 
in some composite spectra. We constrain the metal signal to be positive, which biases our 
results upward (e.g., if a line did not really exist, we would on average add one), but we do 
not think this affects our results, because generally the non-detections have fairly tight upper 
limits. We ignore the fact that N V is a doublet and the two lines are treated independently. 
The very low absorption seen in the weaker N V renders it negligible and is only included for 
completeness. 

No significant evolution in the metal line strengths is seen, but as the size of the BOSS 
survey grows we will revisit the analysis. Here we use the full redshift range used in this flux 
correlation analysis. 
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Figure 16. Measurements of the observed data. Columns correspond to the three redshift bins 
we use, with increasing redshift from left to righlg _The ten rows correspond to the ten bins of /i, 
increasing from top to bottom. In each plot we show the measured £p as a function of separation for 
that particular redshift and fi bin. The best-fit linear theory is over-plotted to guide the eye. The 
measurements are correlated and hence one should not evaluate ll \ 2 by eye" . 
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Figure 17. Measurements of the observed data. Each panel corresponds to redshift-averaged data 
at a certain radius as a function of ijl. We also plot the best-fit linear model to guide the eye. 
Measurements are correlated and hence one should not evaluate "x 2 by eye" . 
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Figure 18. Results of Figure 16 converted to multipoles. The four panels correspond to the redshift- 
averaged monopole, quadrupole (top row), hexadecapole and I — 6 moment (bottom row). Lines are 
best-fit theory. 





0.15 0.20 0.25 0.30 



Figure 19. Fits to the real data. The upper panels correspond to the data fitted using points with 
separations r > 20/i _1 Mpc while the lower panel is for fits using points with r > 10/i _1 Mpc. The 
left-hand-side plots is for the default dataset, while the right hand side plot is for data that include 
quasars flagged as harboring DLAs by the FPG. The red-point corresponds to the value that was used 
in the creation of the synthetic datasets. 
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Figure 20. In this plot wc show fits to the real data when the b and j3 parameters are allowed to be 
a function of redshift. We plot the 1,2 and 3 a error bars. The bias in this plot is with respect to the 
fiducial cosmological model with as = 0.8 at the redshift of interest, therefore the numbers cannot be 
directly compared with the fitted a,b parameters. We plot the value of /? determined from the overall 
fit as a black solid line. All fits in this figure are limited to r > 10/i _1 Mpc. 
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Figure 21. Results of fits of the real data to the b and /3 parameters when they are allowed to 
be a function of scale. We plot constraints on 6, j3 and b(l + (3), and their 1,2 and 3 a error bars 
measured from the corresponding percentiles of MCMC chains. Note that /3 and 6(1 + 0) have flat 
priors on them and that & is a derived parameters. The solid black thick lines correspond to the best 
fit parameters determined from fitting r > 10/i~ 1 Mpc points. 
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